Abstract

Items with the highest discrimination parameter values in a logistic item response
theory model do not necessarily give maximum information. This paper derives
which discrimination parameter values, as a function of the guessing
parameter and the distance between person ability and item difficulty, give
maximum information for the three-parameter logistic item response theory model. An
upperbound for the information as a function of these parameters is derived. This
upperbound for the information function is used to formulate a fast item selection
algorithm for adaptive testing. In a small simulation study this algorithm was one
and a half to six times as fast as an algorithm in which the information of all
items in an item bank is calculated.



Key words: adaptive testing, attenuation paradox, item selection, information
function, discrimination parameter, logistic IRT model

A Simple and Fast Item Selection Procedure for Adaptive Testing

One of the major features of the information function in item response theory
(IRT) is that it can be used for the selection of items from item banks. This can
be done sequentially during test administration, as is the case in computerized
adaptive testing (CAT) (Lord, 1980; Wainer, Dorans, Flaugher, Green, Mislevy,
Steinberg, & Thissen, 1990).

Lord (1977) proposed a maximum information selection criterion for adaptive
testing. For the two- and three-parameter logistic (2PL and 3PL) IRT models it
can be inferred that an increase of the item discrimination parameter will lead
to an increase of information. Lord (1980, Eq. 10-6) showed that for the 2PL and
3PL models the maximum obtainable item information is an increasing function
of the squared item discrimination parameter as long as item difficulty and
person ability $\theta$ are optimally matched. For the 2PL model, maximum
information is obtained when item difficulty is equal to person ability. It can also be
shown that the area under the item information function for the 2PL model is
equal to the discrimination parameter. For the three-parameter model, a similar
relation can be found (see Birnbaum, 1968, Eq. 20.4.26).



Insert Figure 1 about here 



Figure 1 shows item information functions on a $(\theta - b_i)$-scale for different
values of the discrimination parameter for the 2PL model. It can be seen that
increasing the value of the discrimination parameter will lead to a higher but also
more peaked information function. This phenomenon shows that the area under
the information function is concentrated in a smaller range of ability values, i.e.,
the width of the information function becomes smaller as the discrimination
parameter increases.

Samejima (1994) has shown that the area under the square root of the
information function for the 2PL model is equal to $\pi$ ($\pi \approx 3.14$), irrespective of
the value of the discrimination parameter. This implies that in the 2PL model the
information functions of two items must cross at least once. For reasons of
symmetry, the information functions of two items with the same difficulty
parameter, but with different discrimination parameters, must cross twice. This
fact is shown in Figure 1.
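
The value $\pi$ can be checked directly. The short derivation below is only a sketch: it assumes the 2PL information function $I_i(\theta) = a_i^2 P_i(\theta)(1 - P_i(\theta))$ with $P_i(\theta) = 1/(1 + e^{-L_i})$ and $L_i = a_i(\theta - b_i)$, without a scaling constant.

```latex
\begin{align*}
\sqrt{I_i(\theta)} &= a_i\sqrt{P_i(\theta)\,(1 - P_i(\theta))}
  = \frac{a_i}{e^{L_i/2} + e^{-L_i/2}}
  = \frac{a_i}{2\cosh(L_i/2)}, \\
\int_{-\infty}^{\infty} \sqrt{I_i(\theta)}\,d\theta
  &= \int_{-\infty}^{\infty} \frac{du}{\cosh u}
  \qquad \bigl(u = a_i(\theta - b_i)/2,\ d\theta = 2\,du/a_i\bigr) \\
  &= \bigl[\,2\arctan(e^{u})\,\bigr]_{-\infty}^{\infty} = \pi .
\end{align*}
```

The discrimination parameter cancels in the substitution, which is why the area does not depend on it.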

Figure 1 also shows that, when item difficulty is not equal to person ability,
an extreme increase of item discrimination may lead to a decrease of item
information. This effect has been referred to as the attenuation paradox in IRT by
Lord and Novick (1968, p. 368) and Birnbaum (1968, p. 465).
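
The effect is easy to reproduce numerically. The following minimal Python sketch assumes the 2PL information function $a^2 P(1 - P)$ without a scaling constant; the function name `info_2pl` and the chosen values are illustrative and do not come from the paper.

```python
import math


def info_2pl(a, distance):
    """2PL item information a^2 * P * (1 - P) at theta - b = distance."""
    p = 1.0 / (1.0 + math.exp(-a * distance))
    return a * a * p * (1.0 - p)


# At a distance of one unit, information first rises and then falls
# as the discrimination parameter grows.
for a in (0.5, 1.0, 2.4, 5.0, 10.0):
    print(f"a = {a:4.1f}  information = {info_2pl(a, 1.0):.4f}")

# With ability and difficulty matched (distance 0), information is a^2 / 4
# and keeps increasing with a.
print(info_2pl(2.0, 0.0), info_2pl(5.0, 0.0))
```

With the distance fixed at one unit, the item with a = 2.4 is more informative than both the item with a = 1.0 and the item with a = 5.0.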



Insert Figure 2 about here 



Figure 2 depicts item information for the 2PL model as a function of item
discrimination for different values of the distance between person ability and item
difficulty. The well-known fact that an item with a high discrimination parameter
is not necessarily the most informative item, and that, therefore, selection of
items in an adaptive test should not solely be based on the discrimination
parameter, can also be seen in Figure 2.

In this paper, it is shown which discrimination parameter values give
maximum information, and how high this maximum information is. Both the optimal
discrimination and the maximum attainable information are functions of the
distance between item difficulty and person ability for logistic IRT models. The
results of this paper are implemented in an item selection algorithm for adaptive
testing, and a small simulation study will show that this algorithm will improve
item selection.



Derivation of Optimal Item Discrimination Parameter Values 

In this section, it is determined which value of the item discrimination 
parameter is the most informative one, given certain fixed values of the other 
item parameters and the person ability parameter for the 3PL IRT model. The
corresponding maximum information values will also be given. The 2PL model is 
a special case of the 3PL model, and the results will therefore also hold for the 
2PL model. 

The item characteristic curve of the 3PL IRT model is (Lord, 1980,
Eq. 4-37):

$$P_i(\theta) = c_i + \frac{1 - c_i}{1 + e^{-L_i}}, \tag{1}$$

where $L_i = a_i(\theta - b_i)$, and $a_i \in \mathbb{R}^+$, $b_i \in \mathbb{R}$, and $c_i \in [0, 1)$ are the discrimination,
difficulty and guessing parameter, respectively, and $\theta \in \mathbb{R}$ is the ability parameter.
$\mathbb{R}$ and $\mathbb{R}^+$ are sets of real and positive real numbers, respectively. In Equation 1,
$P_i(\theta)$ denotes the probability of a correct response to item i for a person with
ability $\theta$. The corresponding item information function is given by:



$$I_i(\theta) = \frac{a_i^2 (1 - c_i)}{(c_i + e^{L_i})(1 + e^{-L_i})^2} \tag{2}$$

(see e.g. Lord, 1980, Eq. 4-43). The value of $L_i$ for which a maximum
information function value is reached for fixed values of $c_i$ and $(\theta - b_i)$ is found
by setting the derivative of the natural logarithm of the information function with
respect to $L_i$ equal to zero, i.e.:



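$$\frac{\partial \log[I_i(\theta)]}{\partial L_i} = \frac{2}{L_i} - \frac{e^{L_i}}{c_i + e^{L_i}} + \frac{2e^{-L_i}}{1 + e^{-L_i}} = 0. \tag{3}$$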



After some elementary operations, this equation can be reduced to: 



$$2c_i(1 + L_i) + \bigl(2(c_i + 1) + L_i\bigr)e^{L_i} + (2 - L_i)e^{2L_i} = 0. \tag{4}$$
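
The elementary operations can be spelled out as follows. This is a sketch that writes $a_i^2 = L_i^2/(\theta - b_i)^2$ in the information function, so that the logarithmic derivative with respect to $L_i$ is $2/L_i - e^{L_i}/(c_i + e^{L_i}) + 2e^{-L_i}/(1 + e^{-L_i})$, and that assumes $L_i \neq 0$:

```latex
% The last term of the derivative is rewritten using
% 2e^{-L_i}/(1+e^{-L_i}) = 2/(1+e^{L_i}).
\begin{align*}
\frac{2}{L_i} - \frac{e^{L_i}}{c_i + e^{L_i}} + \frac{2}{1 + e^{L_i}} &= 0 \\
% multiply by L_i (c_i + e^{L_i})(1 + e^{L_i}):
2(c_i + e^{L_i})(1 + e^{L_i}) - L_i e^{L_i}(1 + e^{L_i}) + 2L_i(c_i + e^{L_i}) &= 0 \\
% expand and collect powers of e^{L_i}:
2c_i(1 + L_i) + \bigl(2(c_i + 1) + L_i\bigr)e^{L_i} + (2 - L_i)e^{2L_i} &= 0
\end{align*}
```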



Insert Table 1 about here 



It can be shown that solutions of Equation 4 must lie between -6 and 3 (see
the appendix for proof). Hence, Equation 4 can be solved iteratively, substituting
real numbers for $L_i$ between -6 and 3. This leads to two solutions for $L_i$, namely
one for $(\theta - b_i) > 0$ and one for $(\theta - b_i) < 0$. These two optimal values of $L_i$ are
given in Table 1 for $c_i$-values ranging from 0.0 to 0.9 in steps of 0.1. From these
optimal $L_i$-values the corresponding optimal $a_i$-values can also be derived. If, for
example, $c_i = 0.1$ and $(\theta - b_i) = -2$, then the optimal $a_i$-value will be -1.816 / (-2)
= 0.908. This value is depicted as a cross in Figure 3. The optimal $a_i$-values for
$c_i = 0$ and for $c_i = 0.9$ are shown as functions of $(\theta - b_i)$ in Figure 3. All lines for
$0 < c_i < 0.9$ lie between the line of $c_i = 0$ and the line of $c_i = 0.9$.
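
The iterative solution can be sketched in a few lines of Python. The paper does not say which root-finding scheme was used; plain bisection on the intervals (-6, 0) and (0, 3) is an assumption made here, and the names `eq4`, `bisect_root`, `optimal_L` and `info_times_d2` are illustrative. The last function evaluates $I_i(\theta)(\theta - b_i)^2 = L_i^2(1 - c_i)/[(c_i + e^{L_i})(1 + e^{-L_i})^2]$, which follows from the 3PL information function with $a_i = L_i/(\theta - b_i)$.

```python
import math


def eq4(L, c):
    """Left-hand side of Equation 4."""
    return (2 * c * (1 + L) + (2 * (c + 1) + L) * math.exp(L)
            + (2 - L) * math.exp(2 * L))


def bisect_root(f, lo, hi, tol=1e-10):
    """Plain bisection; requires a sign change of f on [lo, hi]."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("no sign change on the interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def optimal_L(c):
    """Negative and positive solutions of Equation 4 for guessing value c."""
    negative = bisect_root(lambda L: eq4(L, c), -6.0, 0.0)
    positive = bisect_root(lambda L: eq4(L, c), 0.0, 3.0)
    return negative, positive


def info_times_d2(L, c):
    """I(theta) * (theta - b)^2 expressed in L = a * (theta - b)."""
    return L * L * (1 - c) / ((c + math.exp(L)) * (1 + math.exp(-L)) ** 2)


for c in (0.0, 0.1, 0.5, 0.9):
    neg, pos = optimal_L(c)
    print(f"c = {c:.1f}  L- = {neg:.3f}  max = {info_times_d2(neg, c):.4f}  "
          f"L+ = {pos:.3f}  max = {info_times_d2(pos, c):.4f}")
```

For $c_i = 0.1$ the two roots agree with the Table 1 entries -1.816 and 2.417 to three decimals, and the corresponding maxima lie just below the tabulated 0.222 and 0.392, which the table note describes as rounded upwards.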



Insert Figure 3 about here 



The values of $I_i(\theta)(\theta - b_i)^2$ in Table 1 can be obtained by substituting the
values for $c_i$ and the optimal values for $a_i$ and $L_i$ in Equation 2. An upperbound
for $I_i(\theta)$ can be found by dividing $I_i(\theta)(\theta - b_i)^2$ by $(\theta - b_i)^2$. As an example, for
items with $c_i = 0.1$, the information at ability level $\theta < b_i$ can be at most:

$$I_i(\theta) \le \frac{0.222}{(\theta - b_i)^2}, \tag{5}$$

a bound that is reached, up to rounding, for $a_i = -1.816/(\theta - b_i)$.

This fact means that the information for items with $c_i = 0.1$ and $(\theta - b_i) = -2$ is
always less than 0.222/4 = 0.0555. This value is shown as a cross in Figure 4, together with
the upperbound information functions for two different values of $c_i$, namely $c_i = 0$
and $c_i = 0.6$. Similar lines can be drawn for $0 < c_i < 0.6$ and these lines all lie
between the two previous lines in Figure 4. For $c_i > 0.6$, the lines lie below the
line of $c_i = 0.6$.

Insert Figure 4 about here



The results in Figure 4 and Table 1 show that three factors determine the
value of the upperbound information, namely $|\theta - b_i|$, sign$(\theta - b_i)$ and $c_i$. The
second factor denotes whether $(\theta - b_i)$ is greater than or less than 0. Keeping the
other factors constant, the maximum information is a decreasing function of both
$|\theta - b_i|$ and $c_i$, and it is higher for $(\theta - b_i) > 0$ than for $(\theta - b_i) < 0$.

For example, for items with $c_i > 0.1$ and $(\theta - b_i) < -2$, the maximum
information is less than the maximum information for items with $c_i = 0.1$ and
$(\theta - b_i) = -2$, which in turn are less informative than items with $c_i = 0.1$ and
$(\theta - b_i) = 2$. As a consequence, when we have items with guessing parameter
values $c_i > 0.1$ which are more difficult than a person’s ability by at least two
units, i.e. $(\theta - b_i) < -2$, then their information $I_i(\theta) < 0.222/4 = 0.0555$. For items with the
same guessing parameter values, but with $(\theta - b_i) > 2$, the information will be
$I_i(\theta) < 0.392/4 = 0.098$.
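
These bounds can be checked by brute force. The sketch below assumes the 3PL information function without a scaling constant and uses the Table 1 entries for a guessing value of 0.1; the names `info_3pl`, `upper_bound` and the grid of discrimination values are illustrative choices, not part of the paper.

```python
import math


def info_3pl(a, b, c, theta):
    """3PL item information at theta, with L = a * (theta - b)."""
    L = a * (theta - b)
    return a * a * (1.0 - c) / ((c + math.exp(L)) * (1.0 + math.exp(-L)) ** 2)


# Table 1 entries for c = 0.1 (rounded upwards in the source).
IMAX_NEG, IMAX_POS = 0.222, 0.392


def upper_bound(theta, b, imax_neg=IMAX_NEG, imax_pos=IMAX_POS):
    """Upper bound on the information of any item with c >= 0.1."""
    d = theta - b
    if d == 0.0:
        return math.inf  # the bound is uninformative at zero distance
    return (imax_pos if d > 0 else imax_neg) / (d * d)


# Scan discrimination values 0.01, 0.02, ..., 10 for theta - b = -2 and +2.
a_grid = [0.01 * k for k in range(1, 1001)]
for b in (2.0, -2.0):
    best_a = max(a_grid, key=lambda a: info_3pl(a, b, 0.1, 0.0))
    print(f"theta - b = {0.0 - b:+.0f}: best a on grid = {best_a:.2f}, "
          f"information = {info_3pl(best_a, b, 0.1, 0.0):.4f}, "
          f"bound = {upper_bound(0.0, b):.4f}")
```

On this grid no discrimination value exceeds the bound, and for a distance of -2 the most informative grid value lies next to the optimal value 0.908 derived above.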



Insert Figure 5 about here 



Finally, in Figure 5 the upperbound information times the squared distance
between person ability $\theta$ and item difficulty b is shown as a function of the
guessing parameter. These lines are based on the results given in Table 1.

An Item Selection Algorithm

The upperbounds for the information derived in the previous section can be
used to improve item selection for adaptive testing. Suppose that we want to
select items from an item bank in an adaptive test with Lord’s (1977) maximum
information criterion. This criterion selects the item which has the highest value
of the information function at a certain value $\theta_0$ on the ability scale, usually
some provisional ability estimate. A straightforward way to select the most informative
item at $\theta_0$ is to calculate the information of all items for this value $\theta_0$. It is not
necessary, however, to compute the information of all items in an item bank to
determine which one has maximum information.

This can be seen as follows. Let $c_{\min}$ be the smallest guessing parameter
value among all items in the itembank. Let $I_{\max,+}(c_{\min})$ and $I_{\max,-}(c_{\min})$ be the
upperbounds of $I_i(\theta_0)(\theta_0 - b_i)^2$ for positive and negative values of $(\theta_0 - b_i)$,
respectively. These upperbounds can be found in Table 1 under the heading
$I_i(\theta)(\theta - b_i)^2$ for values of $c_{\min}$ under the heading $c_i$. So, for example,
$I_{\max,+}(0.1) = 0.392$ and $I_{\max,-}(0.1) = 0.222$. In the following, $I_{\max}$ will either be
$I_{\max,+}(c_{\min})$ or $I_{\max,-}(c_{\min})$.

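The following result for two items i and j can be derived:

$$\text{if } (\theta_0 - b_i)^2 > \frac{I_{\max}}{I_j(\theta_0)}, \text{ then } I_i(\theta_0) < I_j(\theta_0). \tag{6}$$

This follows from:

$$(\theta_0 - b_i)^2 > \frac{I_{\max}}{I_j(\theta_0)} \;\Rightarrow\; \frac{I_{\max}}{(\theta_0 - b_i)^2} < I_j(\theta_0) \tag{7}$$

and

$$I_i(\theta_0)(\theta_0 - b_i)^2 \le I_{\max} \;\Rightarrow\; I_i(\theta_0) \le \frac{I_{\max}}{(\theta_0 - b_i)^2}. \tag{8}$$

The left hand side of Equation 8 follows from the definition of $I_{\max}$ as an
upperbound of $I_i(\theta_0)(\theta_0 - b_i)^2$.

So, as soon as it is determined that a certain item j has a certain information
$I_j(\theta_0)$, then all items i with either positive $(\theta_0 - b_i)$ and
$(\theta_0 - b_i)^2 > I_{\max,+}(c_{\min}) / I_j(\theta_0)$, or negative $(\theta_0 - b_i)$ and
$(\theta_0 - b_i)^2 > I_{\max,-}(c_{\min}) / I_j(\theta_0)$,
have less information than item j, no matter what values for $a_i$ and $c_i \ge c_{\min}$ are
encountered.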

The algorithm 

The algorithm has the following initialisation steps: 

1. Order the N items in the itembank according to their difficulties:
$b_1 \le b_2 \le \ldots \le b_N$.

2. Determine the smallest guessing parameter value in the itembank: $c_{\min}$.

3. Compute the constants $I_{\max,+}(c_{\min})$ and $I_{\max,-}(c_{\min})$.

If $\theta_0$ is a provisional ability estimate after the administration of a set of
items in an adaptive test, then the selection of the next item with maximum
information at $\theta_0$ from the set of items in the itembank not already included in
the test consists of the following steps:

Step 1: Search for the item j with the smallest positive
difference $(\theta_0 - b_j)$ among all items.

If $\theta_0 < b_1$ or $\theta_0 > b_N$, item j is the easiest or the most
difficult item, respectively.

Step 2: Search through the itembank for items i in increasing order of $|\theta_0 - b_i|$,
for positive values of $(\theta_0 - b_i)$.

If $(\theta_0 - b_i)^2 > I_{\max,+}(c_{\min}) / I_j(\theta_0)$, stop searching.

Otherwise, compute $I_i(\theta_0)$; if $I_i(\theta_0) > I_j(\theta_0)$, then j := i; continue the
search.

Step 3: Search through the itembank for items i in increasing order of $|\theta_0 - b_i|$,
for negative values of $(\theta_0 - b_i)$.

If $(\theta_0 - b_i)^2 > I_{\max,-}(c_{\min}) / I_j(\theta_0)$, stop searching.

Otherwise, compute $I_i(\theta_0)$; if $I_i(\theta_0) > I_j(\theta_0)$, then j := i; continue the
search.

Step 1 can be sped up somewhat by starting the search for item j with an
item that is expected to have a difficulty close to the difficulty of item j. Note
that in Steps 2 and 3 the index j represents the most informative item, which is
eventually administered to the test-taker. The answer to this item is scored, and
the test-taker’s ability is re-estimated. Item j is then removed from the item bank.

In conclusion, this search process only passes through a part of the itembank,
i.e. the information is computed only for items with relatively small values of
$(\theta_0 - b_i)^2$. This strategy will speed up item selection considerably.
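
The three steps can be sketched in Python as follows. This is not the Pascal program used in the paper: the function names, the tuple layout `(a, b, c)`, the use of a binary search for Step 1 and the example bank are all assumptions made for illustration. The stopping rule is the inequality of Equation 6, rearranged as $(\theta_0 - b_i)^2 I_j(\theta_0) > I_{\max}$ to avoid a division.

```python
import bisect
import math
import random


def info_3pl(a, b, c, theta):
    """3PL item information at theta, with L = a * (theta - b)."""
    L = a * (theta - b)
    return a * a * (1.0 - c) / ((c + math.exp(L)) * (1.0 + math.exp(-L)) ** 2)


def select_exhaustive(items, theta0):
    """Index of the most informative item, evaluating every item."""
    return max(range(len(items)), key=lambda i: info_3pl(*items[i], theta0))


def select_bounded(items, theta0, imax_pos, imax_neg):
    """Bounded search over items (a, b, c) sorted by difficulty b.

    imax_pos and imax_neg are the Table 1 bounds for the smallest guessing
    value in the bank. Returns (index of best item, items evaluated).
    """
    bs = [b for _, b, _ in items]
    k = bisect.bisect_right(bs, theta0)  # items[:k] have theta0 - b >= 0
    start = max(k - 1, 0)                # Step 1: nearest item at or below theta0
    j, best, evaluated = start, info_3pl(*items[start], theta0), 1
    searches = (
        (range(start - 1, -1, -1), imax_pos),      # Step 2: easier items
        (range(max(k, 1), len(items)), imax_neg),  # Step 3: harder items
    )
    for indices, imax in searches:
        for i in indices:
            if (theta0 - bs[i]) ** 2 * best > imax:
                break  # all remaining items on this side are even farther away
            value = info_3pl(*items[i], theta0)
            evaluated += 1
            if value > best:
                j, best = i, value
    return j, evaluated


# Hypothetical bank: 200 items with guessing values of at least 0.1.
rng = random.Random(1)
bank = sorted(
    ((rng.uniform(0.5, 2.0), rng.uniform(-3.0, 3.0), rng.uniform(0.1, 0.3))
     for _ in range(200)),
    key=lambda item: item[1],
)
for theta0 in (-3, 0, 3):
    j, n_eval = select_bounded(bank, theta0, imax_pos=0.392, imax_neg=0.222)
    print(theta0, j, select_exhaustive(bank, theta0), n_eval)
```

In an adaptive test the administered item would be deleted from `items` before the next call, as described above. Because the bounds decrease with the guessing parameter, passing the Table 1 values for 0.1 remains valid for any bank whose smallest guessing value is 0.1 or higher.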

A Simulation Study

To establish the relative speed of this item selection algorithm, a simulation 
study was performed in which the above described algorithm was compared to 
the algorithm with straightforward calculation of the information of all items in 
an item bank. For both algorithms Lord’s maximum information criterion was 
used to select items. This choice means that for each test exactly the same items, 
in the same order, were selected by the two algorithms. So, the algorithms only 
differed in their CPU-times. The simulations were performed using a program 
written in Borland Pascal 7.0, and were run on a 486DX2/66MHz computer. 

Design 

For seven different ability values, $\theta$ = -3, -2, -1, 0, 1, 2, and 3, respectively,
adaptive tests of 30 items were simulated. These simulations were repeated for
three different itembanks. Each itembank consisted of 200 items. In all banks, the
distributions of the discrimination parameters and guessing parameters were
uniform. The discrimination parameters were uniformly distributed between 0.5
and 2, i.e. $a_i \sim U(0.5, 2)$, and the guessing parameters between 0.1 and 0.3, i.e.
$c_i \sim U(0.1, 0.3)$. The three itembanks differed only in their distributions of the
difficulty parameters $b_i$. One bank had a uniform $U(-3, 3)$ distribution of item
difficulties, and in the other two banks the item difficulties were normally
distributed, one with a variance of 1, and the other with a variance of 3. So, the
distributions of item difficulties were $U(-3, 3)$, $N(0, 1)$, and $N(0, 3)$, respectively.
The number of replications for each condition, i.e., each combination of ability
level and itembank, was 100.
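
Item banks of this kind could be generated as in the following sketch. The function name `make_bank`, the string labels and the seed are hypothetical; the only facts taken from the design are the bank size and the parameter distributions, with $N(0, 3)$ read as a variance of 3, i.e. a standard deviation of $\sqrt{3}$.

```python
import math
import random


def make_bank(difficulty, n_items=200, rng=None):
    """Return items (a, b, c) sorted by difficulty b.

    difficulty is one of "U(-3,3)", "N(0,1)" or "N(0,3)"; the second
    argument of N is a variance, so N(0,3) uses sigma = sqrt(3).
    """
    rng = rng or random.Random()
    draw_b = {
        "U(-3,3)": lambda: rng.uniform(-3.0, 3.0),
        "N(0,1)": lambda: rng.gauss(0.0, 1.0),
        "N(0,3)": lambda: rng.gauss(0.0, math.sqrt(3.0)),
    }[difficulty]
    items = [
        (rng.uniform(0.5, 2.0), draw_b(), rng.uniform(0.1, 0.3))
        for _ in range(n_items)
    ]
    return sorted(items, key=lambda item: item[1])


banks = {name: make_bank(name, rng=random.Random(7))
         for name in ("U(-3,3)", "N(0,1)", "N(0,3)")}
for name, bank in banks.items():
    print(name, len(bank), round(bank[0][1], 2), round(bank[-1][1], 2))
```

Sorting by difficulty at construction time corresponds to the first initialisation step of the algorithm.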







Results 

In Figure 6, the total amount of CPU-time used to select 30 items in 100 tests
is shown as a function of $\theta$ for the two algorithms and the three item banks. The
three lines a little above 40 seconds show the CPU-time needed to calculate the
information of all items in the bank not already included in the test. The other
three lines show the CPU-time needed by the proposed algorithm to select the
items.

This algorithm improved item selection speed by a factor between 1.5 and
6. For a uniform distribution of the difficulty parameters the relative speed did
not depend much on the ability value of the examinee, but for the item banks
with normally distributed difficulties the speed did largely depend on the ability
value. For $\theta = 0$ the amount of CPU-time was relatively high, because there
were relatively many items with a difficulty parameter in this area. So, in these
cases the information had to be computed for relatively many items.

The high CPU-times for extreme negative ability values for the $N(0, 1)$-distribution
of item difficulties may be explained as follows. The item difficulties
are thinly spread around -3. The information of the most informative item j in the
search process was usually not very high, because most of the time $(\theta_0 - b_j) < 0$,
and $|\theta_0 - b_j|$ is large. So, the range of difficulties of items that can be more
informative than item j is rather large. On the other hand, for extreme positive
ability values the CPU-times were not that high, because the information for
items j with large distances between $\theta_0$ and $b_j$ is higher for $(\theta_0 - b_j) > 0$ than for
$(\theta_0 - b_j) < 0$. Note that for the $N(0, 3)$-distribution, the $b_i$-values were spread out
much more than for the $N(0, 1)$-distribution, and that this effect did thus not
occur.

Insert Figure 6 about here



Summary and Conclusion 

When item difficulty and person ability are not optimally matched, the 
optimal discrimination parameter value in logistic item response models is not its 
maximum value. This fact has been referred to as the attenuation paradox in item 
response theory. In this paper it has been shown that the optimal discrimination 
parameter value is inversely related to the distance between item difficulty and 
person ability, and it is derived which discrimination parameter value provides
maximum information at certain points on the ability scale. The corresponding 
maximum information is inversely related to the squared distance between person 
ability and item difficulty. 

The relation between this distance and an upperbound on information can be 
used in an algorithm for the maximum information item selection criterion for 
adaptive testing. In a small simulation study this algorithm was 1.5-6 times as
fast as a simpler and more straightforward algorithm. The difference between
these algorithms is that in the simple algorithm the information function values
for all items are determined, and in the proposed algorithm information is
calculated for a relatively small subset of these items.

It may be argued that this algorithm will not be of much use when so-called
info tables (Thissen & Mislevy, 1990, pp. 116-117) are used, where the
information of the items is computed in advance, i.e. before the actual testing. This is not
entirely true, because the computations needed to set up an info table can also be
sped up by means of the proposed algorithm. Moreover, when itembanks are
frequently changed by inclusion of newly calibrated items or exclusion of
rejected ones, this algorithm can be used to compute updated info tables.

Finally, Kingsbury & Zara (1989), and Stocking & Swanson (1993), among
others, have described constrained adaptive testing procedures. Their procedures
may take much time when a lot of constraints are incorporated in the selection
process. So, it may be worthwhile to investigate the possibility of incorporating
the algorithm proposed in this paper into constrained adaptive testing.

Table 1

Optimal $L_i$-values and Corresponding Maxima for $I_i(\theta)(\theta - b_i)^2$
for Different $c_i$-values Given Fixed $(\theta - b_i)$-Values

         $(\theta - b_i) < 0$                       $(\theta - b_i) > 0$
$c_i$    $L_i$     $I_i(\theta)(\theta - b_i)^2$    $L_i$    $I_i(\theta)(\theta - b_i)^2$
0.0      -2.399    0.440                            2.399    0.440
0.1      -1.816    0.222                            2.417    0.392
0.2      -1.669    0.145                            2.434    0.346
0.3      -1.591    0.101                            2.451    0.300
0.4      -1.541    0.073                            2.467    0.255
0.5      -1.505    0.052                            2.482    0.211
0.6      -1.478    0.037                            2.497    0.168
0.7      -1.457    0.025                            2.512    0.125
0.8      -1.440    0.015                            2.526    0.083
0.9      -1.427    0.007                            2.540    0.041

Note. Values for $I_i(\theta)(\theta - b_i)^2$ were rounded upwards to 3 decimals.

Figure captions

Figure 1. Item information functions.
Figure 2. Item information as functions of the discrimination parameter.
Figure 3. Optimal a-value as a function of $(\theta - b)$.
Figure 4. Upperbound information as a function of $(\theta - b)$.
Figure 5. Upperbound information times $(\theta - b)^2$ as a function of c.
Figure 6. The total CPU-time for two Maximum Information Item Selection
Algorithms for 100 30-item Adaptive Tests for three 200-item Itembanks.

Appendix

Proof that Solutions of Equation 4 Must Lie Between -6 and 3

This proof is split in the following two parts: (1) proof that solutions of Equation
4 cannot be less than -6; and (2) proof that solutions of Equation 4 cannot be
greater than 3.

(1) Proof that $L_i > -6$

Equation 4 can be rewritten as:

$$2c_i(1 + L_i) + 2(c_i - 1)e^{L_i} + \bigl[(2 - L_i)e^{L_i} + (4 + L_i)\bigr]e^{L_i} = 0. \tag{9}$$

The left part of Equation 9 is negative if $L_i < -6$, because it consists of three
parts, of which the first is not positive and the other two are negative if $L_i < -6$:

(a) the first part, $2c_i(1 + L_i)$, is less than or equal to 0 if $L_i < -1$. So it is also
less than or equal to 0 if $L_i < -6$;

(b) the second part, $2(c_i - 1)e^{L_i}$, is negative because $c_i < 1$; and

(c) the third part is negative if $L_i < -6$: the factor $[(2 - L_i)e^{L_i} + (4 + L_i)]$ is
negative because it is an increasing function of $L_i$ for $L_i < 1$, which is negative
for $L_i = -6$, and it is multiplied by $e^{L_i} > 0$.

(2) Proof that $L_i < 3$

Equation 4 can be rewritten as:

$$\left[\frac{2c_i(1 + L_i)}{L_i e^{2L_i}} + \frac{2(c_i + 1) + L_i}{L_i e^{L_i}} + \frac{2 - L_i}{L_i}\right] L_i e^{2L_i} = 0. \tag{10}$$

The part between brackets in (10) is a decreasing function of $L_i$ for $L_i > 0$, and it
can be found by substitution that for $L_i = 3$ this part is smaller than 0. So the part
between brackets is negative if $L_i > 3$. Combining this result with $L_i e^{2L_i} > 0$ if
$L_i > 3$ completes the proof that the left part of Equation 10 is negative if $L_i > 3$.